Consider the number of pigs slaughtered in Victoria, available in the `aus_livestock` dataset.
- Use the `ETS()` function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.
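The report below was presumably produced by code along these lines (a sketch; the filtering step is an assumption, but the object names `aus_pigs`, `fit`, and `fc` match those used later in this post):

```r
library(fpp3)

# Victorian pig slaughter series (assumed filtering)
aus_pigs <- aus_livestock %>%
  filter(Animal == "Pigs", State == "Victoria")

# ETS(A,N,N) is the state-space equivalent of simple exponential smoothing
fit <- aus_pigs %>%
  model(ETS(Count ~ error("A") + trend("N") + season("N")))

fc <- fit %>% forecast(h = 4)  # forecasts for the next four months
report(fit)
```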
## Series: Count
## Model: ETS(A,N,N)
## Smoothing parameters:
## alpha = 0.3221247
##
## Initial states:
## l[0]
## 100646.6
##
## sigma^2: 87480760
##
## AIC AICc BIC
## 13737.10 13737.14 13750.07
- Compute a 95% prediction interval for the first forecast using \(\hat{y} \pm 1.96s\) where \(s\) is the standard deviation of the residuals. Compare your interval with the interval produced by R.
auto_interval <- fc %>%
hilo() %>%
unpack_hilo("95%") %>%
select(4:7) %>%
slice(1)
auto_interval
## # A tsibble: 1 x 5 [1M]
## .mean `80%` `95%_lower` `95%_upper` Month
## <dbl> <hilo> <dbl> <dbl> <mth>
## 1 95187. [83200.06, 107173.1]80      76855.     113518. 2019 Jan
Let’s compute those values manually:
sd_res <- fit %>%
augment() %>%
pull(.resid) %>%
sd()
auto_interval$.mean[1] + c(-1, 1) * (qnorm(0.975) * sd_res)
## [1]  76871.35 113501.77
Close, but not identical.
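The small gap most likely comes from how the residual variance is estimated: fable bases its interval on the model’s estimated variance \(\hat{\sigma}^2\) (reported as `sigma^2` above), while we used the sample standard deviation of the residuals. A sketch of reproducing R’s interval more closely, assuming the `fit` and `fc` objects from above:

```r
# use the model's estimated sigma^2 instead of sd(residuals)
sigma_hat <- fit %>%
  glance() %>%
  pull(sigma2) %>%
  sqrt()

fc$.mean[1] + c(-1, 1) * qnorm(0.975) * sigma_hat
```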
Write your own function to implement simple exponential smoothing. The function should take arguments `y` (the time series), `alpha` (the smoothing parameter \(\alpha\)) and `level` (the initial level \(\ell_0\)). It should return the forecast of the next observation in the series. Does it give the same forecast as `ETS()`?
I took inspiration from this source. We both get the same values for this sub-exercise, but for the next one, this function yields better values.
exp_smoothing_manual <- function(y, arg_alpha, arg_ell_zero, bool_forecast_h1 = FALSE) {
  # If TRUE, iterate one step past the data to obtain the h = 1 forecast
  if (bool_forecast_h1) {
    total_len <- length(y) + 1
  } else {
    total_len <- length(y)
  }
  anti_alpha <- (1 - arg_alpha)
  store_predictions <- rep(NA, total_len)
  store_predictions[1] <- arg_ell_zero
  for (i in seq_along(store_predictions)) {
    if (i == 1) {
      # At the first step, both terms collapse to the initial level
      last_y <- store_predictions[i]
      last_yhat <- store_predictions[i]
    } else {
      last_y <- y[i - 1]
      last_yhat <- store_predictions[i - 1]
    }
    # SES recursion: yhat_t = alpha * y_{t-1} + (1 - alpha) * yhat_{t-1}
    yhat <- arg_alpha * last_y + anti_alpha * last_yhat
    store_predictions[i] <- yhat
  }
  # yhat holds the last prediction computed in the loop
  return(yhat)
}
fit_model_pars <- coef(fit) %>%
select(term, estimate)
value_manual <- exp_smoothing_manual(
y = aus_pigs$Count,
arg_alpha = fit_model_pars$estimate[fit_model_pars$term == "alpha"],
arg_ell_zero = fit_model_pars$estimate[fit_model_pars$term == "l[0]"],
TRUE
)
value_auto <- fc$.mean[1]
value_manual == value_auto
## [1] TRUE
Great!
Modify your function from the previous exercise to return the sum of squared errors rather than the forecast of the next observation. Then use the `optim()` function to find the optimal values of \(\alpha\) and \(\ell_0\). Do you get the same values as the `ETS()` function?
Note: this source obtained a value of 0.299 for alpha and 76,379.265 for the level, but the code below has been heavily modified and doesn’t reproduce the source’s values. In any case, the correct values are:
true_values <- coef(fit) %>%
select(term, estimate)
true_values
## # A tibble: 2 × 2
## term estimate
## <chr> <dbl>
## 1 alpha 0.322
## 2 l[0] 100647.
Let’s try to get optimal values:
optimize_exp_smoothing <- function(pars = c(arg_alpha, arg_ell_zero), y) {
  # One-step-ahead fitted values from the manual SES function
  out_predicted <- rep(NA, length(y))
  for (i in seq_along(out_predicted)) {
    out_predicted[i] <- exp_smoothing_manual(
      y = y[1:i],
      arg_alpha = pars[1],
      arg_ell_zero = pars[2]
    )
  }
  squared_errors <- (out_predicted - y) ^ 2
  # Minimising sqrt(SSE) gives the same optimum as minimising SSE,
  # since the square root is monotone increasing
  out <- sum(squared_errors) %>% sqrt()
  return(out)
}
optim_pigs <- optim(
par = c(0.5, aus_pigs$Count[1]),
y = aus_pigs$Count,
fn = optimize_exp_smoothing,
method = "Nelder-Mead"
)
true_values %>%
mutate(my_values = optim_pigs$par,
pct_diff = (my_values / estimate) - 1)
## # A tibble: 2 × 4
## term estimate my_values pct_diff
## <chr> <dbl> <dbl> <dbl>
## 1 alpha 0.322 0.322 -0.0000323
## 2 l[0] 100647. 99223. -0.0141
Very small differences. For \(\alpha\), I am practically getting the correct value, while for \(\ell_{0}\) (starting value), I missed the mark by 1.41%.
Combine your previous two functions to produce a function that both finds the optimal values of \(\alpha\) and \(\ell_{0}\), and produces a forecast of the next observation in the series.
exp_smooth_combine <- function(time_series) {
optim_series <- optim(
par = c(0.5, time_series[1]),
y = time_series,
fn = optimize_exp_smoothing,
method = "Nelder-Mead"
)
out <- exp_smoothing_manual(
y = time_series,
arg_alpha = optim_series$par[1],
arg_ell_zero = optim_series$par[2],
TRUE
)
return(out)
}
var_forecast <- exp_smooth_combine(aus_pigs$Count)
var_forecast
## [1] 95186.57
Is this the same as the forecast value from fable?
## My Value: 95,186.57 | model value: 95,186.56
Data set `global_economy` contains the annual Exports from many countries. Select one country to analyse.
- Plot the Exports series and discuss the main features of the data.
china_exports <- global_economy %>%
filter(Country == "China") %>%
select(Year, Exports)
china_exports %>%
autoplot(Exports) +
labs(title = "China Exports") +
scale_x_continuous(n.breaks = 20)
We can see that China’s exports of goods and services (as a percentage of GDP) show a positive trend until 2008, after which a massive drop is visible. In the context of time series, there is no seasonality present.
- Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.
- Compute the RMSE values for the training data.
## [1] 1.900172
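The forecasts and the training RMSE above were presumably produced along these lines (a sketch; `china_ann_fit` is an assumed name not shown in the source):

```r
# ETS(A,N,N) on the China exports series (assumed object names)
china_ann_fit <- china_exports %>%
  model(ETS(Exports ~ error("A") + trend("N") + season("N")))

# plot the five-year forecasts ...
china_ann_fit %>%
  forecast(h = 5) %>%
  autoplot(china_exports)

# ... and extract the training RMSE
china_ann_fit %>%
  accuracy() %>%
  pull(RMSE)
```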
- Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.
The simple ETS(A,N,N) model (right graph) shows a stable, flat level of exports during the forecast horizon. On the other hand, the trended ETS(A,A,N) model shows a downward future trajectory. Prediction intervals for both methods show that there is considerable uncertainty in the future exports over the five-year forecast period.
Regarding what actually happened in 2018 and forward, this link is helpful and shows that for this case study, model on the right graph would have been more appropriate.
- Compare the forecasts from both methods. Which do you think is best?
## # A tibble: 2 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "ETS(Exports ~ error… Trai… 0.266 1.90 1.26 1.84 9.34 0.983 0.991 0.288
## 2 "ETS(Exports ~ error… Trai… -0.0854 1.91 1.25 -0.169 9.57 0.973 0.995 0.232
The ETS(A,N,N) model is more accurate on the training data, but only slightly: RMSE 1.90 vs. 1.91, MAPE 9.34 vs. 9.57.
- Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.
For the solution, see the code snippet in Exercise 01, b.
Forecast the Chinese GDP from the `global_economy` data set using an ETS model. Experiment with the various options in the `ETS()` function to see how much the forecasts change with a damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.
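The accuracy table below suggests four Box-Cox ETS variants were compared. A possible setup (the exact model specifications are not shown in the source, so treat this as a sketch):

```r
china <- global_economy %>%
  filter(Country == "China")

# Box-Cox parameter chosen by the Guerrero method
lambda <- china %>%
  features(GDP, features = guerrero) %>%
  pull(lambda_guerrero)

# four candidate specifications (my own choices)
china_fit <- china %>%
  model(
    ETS(box_cox(GDP, lambda)),
    ETS(box_cox(GDP, lambda) ~ trend("A")),
    ETS(box_cox(GDP, lambda) ~ trend("Ad")),
    ETS(box_cox(GDP, lambda) ~ trend("Ad", phi = 0.8))
  )

china_fit %>% forecast(h = 20) %>% autoplot(china)
```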
china_fit %>%
accuracy() %>%
arrange(MAPE)
## # A tibble: 4 × 10
## .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "ETS(box_cox(GDP… Trai… -2.75e10 2.88e11 1.25e11 0.607 7.17 0.578 0.688 0.665
## 2 "ETS(box_cox(GDP… Trai… 6.35e 9 1.96e11 1.02e11 1.77 7.26 0.472 0.468 0.135
## 3 "ETS(box_cox(GDP… Trai… -6.22e10 3.08e11 1.32e11 0.118 7.36 0.611 0.735 0.670
## 4 "ETS(box_cox(GDP… Trai… 2.10e11 4.16e11 2.13e11 8.37 10.8 0.983 0.991 0.790
Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?
The multiplicative method is preferred when the seasonal variations change proportionally to the level of the series, which is precisely the case here.
Making the trend damped improves the forecast!
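A sketch of the comparison (the model names are my own):

```r
# multiplicative Holt-Winters, with and without damping
gas_fit <- aus_production %>%
  model(
    hw_mult   = ETS(Gas ~ error("M") + trend("A") + season("M")),
    hw_damped = ETS(Gas ~ error("M") + trend("Ad") + season("M"))
  )

gas_fit %>%
  forecast(h = "3 years") %>%
  autoplot(aus_production)

gas_fit %>% accuracy()
```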
Recall your retail time series data (from Exercise 8 in Section 2.10).
- Why is multiplicative seasonality necessary for this series?
The multiplicative method is preferred when the seasonal variations change proportionally to the level of the series, which is precisely the case here.
- Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
- Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
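The comparison table below was presumably produced along these lines (a sketch; `myseries` is the retail series from Exercise 8 in Section 2.10, and `retail_fit` is an assumed name):

```r
# Holt-Winters' multiplicative method, with and without damping
retail_fit <- myseries %>%
  model(
    hw        = ETS(Turnover ~ error("M") + trend("A") + season("M")),
    hw_damped = ETS(Turnover ~ error("M") + trend("Ad") + season("M"))
  )

retail_fit %>% accuracy()
```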
## # A tibble: 2 × 12
## State Indus…¹ .model .type ME RMSE MAE MPE MAPE MASE RMSSE
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Northern T… Clothi… "ETS(… Trai… -0.00119 0.600 0.443 -0.265 5.21 0.506 0.517
## 2 Northern T… Clothi… "ETS(… Trai… 0.0582 0.602 0.449 0.499 5.23 0.512 0.520
## # … with 1 more variable: ACF1 <dbl>, and abbreviated variable name ¹Industry
## # ℹ Use `colnames()` to see all variable names
- Check that the residuals from the best method look like white noise.
## # A tibble: 2 × 5
## State Industry .model lb_stat lb_pv…¹
## <chr> <chr> <chr> <dbl> <dbl>
## 1 Northern Territory Clothing, footwear and personal acc… "ETS(… 0.663 0.415
## 2 Northern Territory Clothing, footwear and personal acc… "ETS(… 0.894 0.344
## # … with abbreviated variable name ¹lb_pvalue
The Ljung-Box test yields non-significant p-values, so we cannot reject \(H_{0}\), which states that the residuals are independently distributed (i.e., white noise).
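The Ljung-Box statistics above can be obtained with `features()`; a sketch, assuming the two Holt-Winters fits are stored in a mable called `retail_fit` (a name not shown in the source; the lag choice is mine):

```r
# Ljung-Box test on the innovation residuals of each model
retail_fit %>%
  augment() %>%
  features(.innov, ljung_box, lag = 24)
```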
- Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?
## # A tibble: 2 × 12
## .model State Indus…¹ .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 "ETS(Tu… Nort… Clothi… Test 0.504 1.38 1.03 2.68 7.42 NaN NaN 0.599
## 2 "NAIVE(… Nort… Clothi… Test -4.66 5.39 5.00 -40.4 42.1 NaN NaN 0.262
## # … with abbreviated variable name ¹Industry
ETS beats the seasonal naïve approach, quite clearly.
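A sketch of the test-set comparison (the split and model names are assumptions):

```r
# train up to the end of 2010, test on the remainder
myseries_train <- myseries %>%
  filter(year(Month) <= 2010)

cmp_fit <- myseries_train %>%
  model(
    hw     = ETS(Turnover ~ error("M") + trend("A") + season("M")),
    snaive = SNAIVE(Turnover)
  )

cmp_fit %>%
  forecast(new_data = anti_join(myseries, myseries_train)) %>%
  accuracy(myseries)
```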
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
Let’s do this!
optimal_lambda <- guerrero(myseries$Turnover) %>% unname()
myseries_boxcox <- myseries %>%
mutate(Turnover = box_cox(Turnover, optimal_lambda))
myseries_boxcox_dcmp <- myseries_boxcox %>%
model(STL(Turnover ~ trend(window = 12), robust = TRUE)) %>%
components() %>%
select(-.model)
myseries_boxcox_dcmp_tt_split <- time_series_split(myseries_boxcox_dcmp, nrow(myseries_boxcox_dcmp %>% filter(year(Month) <= 2010)))
myseries_boxcox_dcmp %>%
mutate(Category = ifelse(Month %in% myseries_boxcox_dcmp_tt_split$train$Month, "Train", "Test")) %>%
ggplot(aes(x = Month, y = season_adjust, color = Category)) +
geom_line() +
labs(title = "Seasonally adjusted data") +
scale_color_manual(values = c("red", "black"))
myseries_boxcox_dcmp_fit_train <- myseries_boxcox_dcmp_tt_split$train %>%
model(ETS(season_adjust ~ error("A") + trend("A") + season("N")))
myseries_boxcox_dcmp_fc_test <- myseries_boxcox_dcmp_fit_train %>%
forecast(new_data = myseries_boxcox_dcmp_tt_split$test)
check_forecast_with_split_data(
object_fit = myseries_boxcox_dcmp_fit_train,
object_fc = myseries_boxcox_dcmp_fc_test,
object_data = myseries_boxcox_dcmp,
object_test = myseries_boxcox_dcmp_tt_split$test
)
The results are even better!
Compute the total domestic overnight trips across Australia from the `tourism` dataset.
- Plot the data and describe the main features of the series.
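The total series and the feature summary below can be computed like this (a sketch; `tourism_total` is an assumed name):

```r
# sum Trips across all keys to get the Australia-wide total
tourism_total <- tourism %>%
  summarise(Trips = sum(Trips))

tourism_total %>% autoplot(Trips)

# full feature set, as printed below
tourism_total %>%
  features(Trips, feature_set(pkgs = "feasts"))
```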
## trend_strength seasonal_strength_year seasonal_peak_year
## "0.94579" "0.74700" "1.00000"
## seasonal_trough_year spikiness linearity
## "3.00000" "12,063,155.58598" "9,641.45055"
## curvature stl_e_acf1 stl_e_acf10
## "11,857.46735" "-0.43415" "0.40279"
## acf1 acf10 diff1_acf1
## "0.72174" "2.57513" "-0.22286"
## diff1_acf10 diff2_acf1 diff2_acf10
## "1.19018" "-0.39996" "0.93099"
## season_acf1 pacf5 diff1_pacf5
## "0.68044" "0.82441" "0.61493"
## diff2_pacf5 season_pacf zero_run_mean
## "0.88030" "0.24853" "0.00000"
## nonzero_squared_cv zero_start_prop zero_end_prop
## "0.01010" "0.00000" "0.00000"
## lambda_guerrero kpss_stat kpss_pvalue
## "1.99993" "0.84674" "0.01000"
## pp_stat pp_pvalue ndiffs
## "-1.83097" "0.10000" "2.00000"
## nsdiffs bp_stat bp_pvalue
## "1.00000" "41.67324" "0.00000"
## lb_stat lb_pvalue var_tiled_var
## "43.25577" "0.00000" "0.05751"
## var_tiled_mean shift_level_max shift_level_index
## "0.84005" "2,415.32131" "2.00000"
## shift_var_max shift_var_index shift_kl_max
## "3,735,701.35753" "45.00000" "2.23865"
## shift_kl_index spectral_entropy n_crossing_points
## "75.00000" "0.49398" "26.00000"
## longest_flat_spot coef_hurst stat_arch_lm
## "3.00000" "0.97717" "0.83773"
- Decompose the series using STL and obtain the seasonally adjusted data.
- Forecast the next two years of the series using an additive damped trend method applied to the seasonally adjusted data. (This can be specified using decomposition_model().)
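The decomposition step isn’t shown in the source; a self-contained sketch of how `tourism_australia_dcmp` (the object used below) could be built:

```r
# STL decomposition of the total series; season_adjust is among the components
tourism_australia_dcmp <- tourism %>%
  summarise(Trips = sum(Trips)) %>%
  model(STL(Trips)) %>%
  components() %>%
  select(-.model)
```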
tourism_australia_dcmp_fit <- tourism_australia_dcmp %>%
model(ETS(season_adjust ~ error("A") + trend("Ad") + season("N")),
ETS(season_adjust ~ error("A") + trend("A") + season("N")))
Exercise part c. will be solved together with part d. below.
- Forecast the next two years of the series using an appropriate model for Holt’s linear method applied to the seasonally adjusted data (as before but without damped trend).
- Now use ETS() to choose a seasonal model for the data.
- Compare the RMSE of the ETS model with the RMSE of the models you obtained using STL decompositions. Which gives the better in-sample fits?
## # A tibble: 3 × 3
## .model RMSE MAPE
## <chr> <dbl> <dbl>
## 1 "ETS(season_adjust ~ error(\"A\") + trend(\"A\") + season(\"N\"))" 789. 2.82
## 2 "ETS(season_adjust ~ error(\"A\") + trend(\"Ad\") + season(\"N\")… 790. 2.83
## 3 "ETS(Trips ~ error(\"A\") + trend(\"A\") + season(\"A\"))" 794. 2.86
Holt’s linear method is most precise for in-sample fits (training data).
- Compare the forecasts from the three approaches. Which seems most reasonable?
Honestly, the model with the worst in-sample performance seems to give the most reasonable forecast, since some seasonality is present in the original data.
- Check the residuals of your preferred model.
Everything seems fine.